Math Parser
Volume Number: 7
Issue Number: 5
Column Tag: Pascal Procedures
A Practical Parser
By Bill Murray, Annandale, VA
Introduction
This article describes a practical mathematical parser - an interactive program
that can be used to perform basic arithmetic operations and standard function
calculations. The heart of the program is a Pascal function [eval] which takes a
mathematical expression [a str255 variable] as input and returns a real number [the
result of the calculation]. Other binary operations and standard functions can be added
to the code if one wishes to expand the program. By adding the Pascal code for eval to
software code, an existing program can be made more flexible.
When I first started thinking about writing code to calculate a mathematical
expression, I soon realized that it wasn’t that easy [at least for me]. I had to delve into
such concepts as parsing, tokens, lexical analysis, node trees and tables, etc.
After searching the literature I came across an outline of an algorithm for
parsing arithmetic expressions in Principles of Systems Programming by Robert M.
Graham (Wiley, 1975). It used the four binary operators of multiplication, division,
addition, and subtraction, set precedence values for operator tokens, and provided for
handling parentheses. Basically, the rules of the algorithm were the same ones we
learned in algebra. For example: multiply (or divide) before adding or subtracting;
start with inner parentheses and work outwards; if a pair of binary operators have
equal precedence, perform the left operator of the pair first.
The above algorithm was the starting point for developing my own parser. I
extended it to include the exponentiation operator and some standard functions [such as
sin, cos, etc.].
The main advantage in having a mathematical parser as part of a program is that
it allows a user greater flexibility. Expressions [in lieu of numbers] can be entered
from the keyboard. As an example, in fitting a least squares model to data, only the
function would have to be typed in. Any names could be used for the variables, since the
lexical analysis phase of the parser identifies constants and variables in an expression.
Once the variables were identified they could then be solved for in the least squares
sense. Also, by entering key words or phrases, a program could be made to branch to
other areas, such as a plotting routine or a Fast Fourier Transform routine. The user
could then return to the interactive mode of the parser by selecting an appropriate
menu item.
In addition to the above, text files containing a number of mathematical
expressions can be created and saved. If the name of a particular file is subsequently
entered in an expression, all statements in that file will be executed and a real variable
created which has the same name as the text file and a value equal to the result of the
calculation. In this way, one can perform specific calculations, using the text file as one
would a mini-program. Extended real variables can be defined, listed, and stored for
further sessions, and the number of decimal places can be set by the user.
The Pascal program listed in the appendix is a watered-down version of one
which is obtainable on disk. Both were written using THINK Pascal 2.0, System 4.2, and
Finder 6.0. The project contains 10 units. In the extended program, 20 textfiles can be
created, modified and stored for further use. In order to conserve on heap space,
variable arrays are dynamically allocated through the use of handles.
Finally, it should be noted that the tokens need not be restricted to real numbers.
That is, one could use the present algorithm with some minor modification to handle
matrices and vectors.
Let’s get started by defining some terms. Then we will describe the procedures
and functions which make up eval.
Parser
Parsing refers to the process of breaking something down into parts, explaining
the form, function, and interrelation of the parts. In diagramming a sentence, we break
it down into nouns, verbs, adjectives, etc. In this way we gain more insight into the
meaning of an expression. Parsing is, essentially, a recognizing of structure.
A mathematical parser is an algorithm that identifies the structure and
constituent parts of a mathematical expression. The expression, a string of characters
[str255], is comprised of a number of basic elements, such as symbols, words,
numbers, etc. The job of the parser is to recognize these elements [the operators and
operands], transform them into tokens, and then into a node table which can be used to
calculate the result of the expression, a real number.
Tokens
Tokens are symbols having the same form and size. Two pieces of information are
associated with each each - its type and its index within an array of tokens. In the
source code they are strings of 20 characters. Their types include: variable, constant,
real, node, binary, unary, and function.
LexicalAnalysis
The LexicalAnalysis procedure is the workhorse of the parsing process. It
identifies the basic elements in an expression, replaces them with tokens, determines